Characteristics of hand and machine-assigned scores
Abstract
Assessing learning in higher education is a critical concern for policy makers, educators, parents, and students, and doing so appropriately is likely to require including constructed-response tests in the assessment system. We examined whether scoring costs and other concerns with using open-ended measures on a large scale (e.g., turnaround time and inter-reader consistency) could be addressed by machine grading the answers. Analyses with 1359 students from 14 colleges found that two human readers agreed highly with each other in the scores they assigned to the answers to three types of open-ended questions. These reader-assigned scores also agreed highly with those assigned by a computer. The correlations of the machine-assigned scores with SAT scores, college grades, and other measures were comparable to the correlations of these variables with the hand-assigned scores. Machine scoring did not widen differences in mean scores between racial/ethnic or gender groups. Our findings demonstrated that machine scoring can facilitate the use of open-ended questions in large-scale testing programs by providing a fast, accurate, and economical way to grade responses.

Until the turn of the 21st century, most large-scale K-12 and college testing programs relied almost exclusively on multiple-choice tests. There are several understandable reasons for this. It takes much longer to score the answers to essay and other “constructed-response” questions (also referred to below as “open-ended” or “free-response” questions) than it does to have a machine scan multiple-choice answer sheets. Thus, hand scoring of essay answers tends to increase the time required to report results. There are also concerns about subjectivity in grading, because human readers do not always agree with each other (or even with themselves over time) on the score they assign to an answer.
Scoring costs and logistical problems (such as arranging for readers) are much greater with open-ended tests than they are with multiple-choice exams. In addition, score reliability per hour of testing time is generally greater with multiple-choice tests than it is with open-ended ones (Wainer and Thissen [19]; Klein and Bolus [5]). Nevertheless, there are important skills that can only be assessed (or assessed well) with open-ended measures. This is especially so in higher education. Consequently, college and graduate school

Dr. Laura Hamilton from RAND, Professor Richard Shavelson from Stanford University, and Professor George Ku from Indiana University provided many helpful suggestions on earlier drafts of this chapter. Dr. Roger Bolus, a consultant to the Council for Aid to Education, ran the statistical analyses presented in this chapter.

GANSK & Associates, 120 Ocean Park Blvd., #609, Santa Monica, CA 90405, USA